5 research outputs found
Human-Machine Collaborative Optimization via Apprenticeship Scheduling
Coordinating agents to complete a set of tasks with intercoupled temporal and
resource constraints is computationally challenging, yet human domain experts
can solve these difficult scheduling problems using paradigms learned through
years of apprenticeship. A process for manually codifying this domain knowledge
within a computational framework is necessary to scale beyond the
``single-expert, single-trainee" apprenticeship model. However, human domain
experts often have difficulty describing their decision-making processes,
causing the codification of this knowledge to become laborious. We propose a
new approach for capturing domain-expert heuristics through a pairwise ranking
formulation. Our approach is model-free and does not require enumerating or
iterating through a large state space. We empirically demonstrate that this
approach accurately learns multifaceted heuristics on a synthetic data set
incorporating job-shop scheduling and vehicle routing problems, as well as on
two real-world data sets consisting of demonstrations of experts solving a
weapon-to-target assignment problem and a hospital resource allocation problem.
We also demonstrate that policies learned from human scheduling demonstration
via apprenticeship learning can substantially improve the efficiency of a
branch-and-bound search for an optimal schedule. We employ this human-machine
collaborative optimization technique on a variant of the weapon-to-target
assignment problem. We demonstrate that this technique generates solutions
substantially superior to those produced by human domain experts at a rate up
to 9.5 times faster than an optimization approach and can be applied to
optimally solve problems twice as complex as those solved by a human
demonstrator.Comment: Portions of this paper were published in the Proceedings of the
International Joint Conference on Artificial Intelligence (IJCAI) in 2016 and
in the Proceedings of Robotics: Science and Systems (RSS) in 2016. The paper
consists of 50 pages with 11 figures and 4 table
Feedback Control of Climate Dynamics
Mentor: Jr-Shin Li
From the Washington University Undergraduate Research Digest: WUURD, Volume 6, Issue 1, Fall 2010. Published by the Office of Undergraduate Research.
Henry Biggs, Director of Undergraduate Research and Associate Dean in the College of Arts & Sciences; Joy Zalis Kiefer, Undergraduate Research Coordinator, Co-editor, and Assistant Dean in the College of Arts & Sciences; Kristin Sobotka, Editor
Learning to tutor from expert demonstrators via apprenticeship scheduling
We have conducted a study investigating the use of automated tutors for educating players in the context of serious gaming (i.e., game designed as a professional training tool). Historically, researchers and practitioners have developed automated tutors through a process of manually codifying domain knowledge and translating that into a human-interpretable format. This process is laborious and leaves much to be desired. Instead, we seek to apply novel machine learning tech-niques to, first, leam a model from domain experts' demonstrations how to solve such problems, and, second, use this model to teach novices how to think like experts. In this work, we present a study comparing the performance of an automated and a traditional, manually-constructed tutor. To our knowledge, this is the first investigation using learning from demonstration techniques to learn from experts and use that knowledge to teach novices
Apprenticeship scheduling: Learning to schedule from human experts
Coordinating agents to complete a set of tasks with intercoupled temporal and resource constraints is computationally challenging, yet human domain experts can solve these difficult scheduling problems using paradigms learned through years of apprenticeship. A process for manually codifying this domain knowledge within a computational framework is necessary to scale beyond the "singleexpert, single-trainee" apprenticeship model. However, human domain experts often have difficulty describing their decision-making processes, causing the codification of this knowledge to become laborious. We propose a new approach for capturing domain-expert heuristics through a pairwise ranking formulation. Our approach is model-free and does not require enumerating or iterating through a large state-space. We empirically demonstrate that this approach accurately learns multifaceted heuristics on both a synthetic data set incorporating jobshop scheduling and vehicle routing problems and a real-world data set consisting of demonstrations of experts solving a weapon-to-target assignment problem.National Science Foundation (U.S.). Graduate Research Fellowship Program (grant number 2388357
Human-machine collaborative optimization via apprenticeship scheduling
Coordinating agents to complete a set of tasks with intercoupled temporal and resource constraints is computationally challenging, yet human domain experts can solve these difficult scheduling problems using paradigms learned through years of apprenticeship. A process for manually codifying this domain knowledge within a computational framework is necessary to scale beyond the "single-expert, single-trainee" apprenticeship model. However, human domain experts often have difficulty describing their decision-making processes. We propose a new approach for capturing this decision-making process through counterfactual reasoning in pairwise comparisons. Our approach is model-free and does not require iterating through the state space. We demonstrate that this approach accurately learns multifaceted heuristics on a synthetic and real world data sets. We also demonstrate that policies learned from human scheduling demonstration via apprenticeship learning can substantially improve the efficiency of schedule optimization. We employ this human-machine collaborative optimization technique on a variant of the weapon-to-target assignment problem. We demonstrate that this technique generates optimal solutions up to 9.5 times faster than a state-of-the-art optimization algorithm